OCPBUGS-74970: Fix kubelet certificate wait loop in criometricsproxy.yaml#6125
aksjadha wants to merge 1 commit into
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Walkthrough

The PR inverts the kubelet server certificate wait condition across three node configuration templates and updates an arbiter volumeMount path. InitContainers now wait until /var/lib/kubelet/pki/kubelet-server-current.pem exists before proceeding.

Changes: Kubelet Certificate Readiness Wait
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: aksjadha. The full list of commands accepted by this bot can be found here. |
|
/jira refresh |
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is invalid:
|
/jira refresh |
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: |
af45f24 to ee0dcdd
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: |
|
The fix works as expected. Before the fix, pods restarted multiple times because the init container was not waiting for the file to exist. With the fix, the init container checks whether the file exists and uses the correct mount path, i.e. /var/lib/kubelet/, and the pods no longer restart repeatedly. |
… init container's volumeMount to /var/lib/kubelet
ee0dcdd to 78749ce
|
@aksjadha: all tests passed! |
Related bug: Fix kubelet certificate wait loop and mount path in criometricsproxy.yaml
The previous condition `[ -n "$(test -e ...)" ]` always evaluated to false because `test -e` produces no stdout output; it communicates via exit code only. The command substitution therefore always expanded to an empty string, so `-n` was always false and the loop exited immediately instead of waiting for the kubelet certificate to appear.

The init container mounts the host's `/var/lib/kubelet` at `/var`. Inside the init container, the host's `/var/lib/kubelet/pki/kubelet-server-current.pem` therefore appears at `/var/pki/kubelet-server-current.pem`, but the script checks `/var/lib/kubelet/pki/kubelet-server-current.pem`, which doesn't exist at that path inside the init container.

- What I did

  - Replaced the wait condition with `[ ! -e /var/lib/kubelet/pki/kubelet-server-current.pem ]`, which properly loops until the kubelet certificate file exists.
  - Changed the init container's volumeMount to `/var/lib/kubelet` from `/var` to match the main container, so the script's path `/var/lib/kubelet/pki/kubelet-server-current.pem` resolves correctly.

- How to verify it

  Check the `kube-rbac-proxy-crio-ip` pod logs in namespace `openshift-machine-config-operator` to verify that the CRI-O metrics proxy init container correctly waits for the kubelet certificate before proceeding.

- Description for the changelog

  Fixed the kubelet certificate wait loop in `criometricsproxy.yaml` across all node roles (arbiter, master, worker). The condition `[ -n "$(test -e /var/lib/kubelet/pki/kubelet-server-current.pem)" ]` was incorrect. Replaced it with the correct condition `[ ! -e /var/lib/kubelet/pki/kubelet-server-current.pem ]`, which properly loops until the kubelet certificate file exists.
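The difference between the two conditions can be reproduced outside the cluster. The following is a minimal sketch, not the template's actual script: it uses a temporary file as a stand-in for the kubelet certificate, and the helper names (`old_condition`, `new_condition`), the one-second polling interval, and the background `touch` that simulates the kubelet writing the file are all assumptions made for illustration.

```shell
# Hypothetical stand-in for /var/lib/kubelet/pki/kubelet-server-current.pem.
cert_dir="$(mktemp -d)"
cert="$cert_dir/kubelet-server-current.pem"

# Old condition: $(...) captures stdout, and test prints nothing,
# so -n always sees an empty string regardless of the file's existence.
old_condition() { [ -n "$(test -e "$1")" ]; }

# New condition: true while the file is missing, so a while loop keeps waiting.
new_condition() { [ ! -e "$1" ]; }

# With the file absent, a "while <condition>" loop should keep waiting.
old_condition "$cert" && old_absent=waits || old_absent=exits
new_condition "$cert" && new_absent=waits || new_absent=exits

# Simulate the certificate appearing a moment later (assumed timing).
( sleep 1; touch "$cert" ) &

waited=0
while new_condition "$cert"; do
  sleep 1                      # polling interval is an assumption
  waited=$((waited + 1))
done
wait                           # reap the background job

# With the file present, the loop condition is false and the loop ends.
new_condition "$cert" && new_present=waits || new_present=exits

echo "old, file absent:  $old_absent"
echo "new, file absent:  $new_absent"
echo "new, file present: $new_present"
echo "polls before the file appeared: $waited"

rm -rf "$cert_dir"
```

With the old condition the loop body never runs, even while the file is missing, which matches the repeated pod restarts reported before the fix.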